
Learning to Compose Domain-Specific Transformations for Data Augmentation

Neural Information Processing Systems

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-the-art results is a time-consuming manual task in practice. We propose a method for automating this process by learning a generative sequence model over user-specified transformation functions using a generative adversarial approach. Our method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data. The learned transformation model can then be used to perform data augmentation for any end discriminative model. In our experiments, we show the efficacy of our approach on both image and text datasets, achieving improvements of 4.0 accuracy points on CIFAR-10, 1.4 F1 points on the ACE relation extraction task, and 3.4 accuracy points when using domain-specific transformation operations on a medical imaging dataset as compared to standard heuristic augmentation approaches.
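The core idea of the abstract, sampling a sequence of user-specified transformation functions from a learned model and applying them in order, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy transformations `shift`, `scale`, and `reverse`, the helper names, and the use of a first-order Markov chain as a stand-in for the generative sequence model. The adversarial training that would learn the probabilities is omitted; the probabilities are simply supplied by hand.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]
Transformation = Callable[[Vector], Vector]


# Hypothetical user-specified transformation functions (illustrative only;
# real ones would be image or text operations such as rotations or word swaps).
def shift(x: Vector) -> Vector:
    return [v + 1.0 for v in x]


def scale(x: Vector) -> Vector:
    return [v * 2.0 for v in x]


def reverse(x: Vector) -> Vector:
    return list(reversed(x))


TRANSFORMATIONS: List[Transformation] = [shift, scale, reverse]


def sample_sequence(
    start_probs: Sequence[float],
    transition_probs: Sequence[Sequence[float]],
    length: int,
    rng: random.Random,
) -> List[int]:
    """Sample transformation indices from a first-order Markov chain.

    Assumption: this chain stands in for the learned sequence model.
    transition_probs[i][j] is the probability of choosing transformation j
    immediately after transformation i.
    """
    indices: List[int] = []
    probs = start_probs
    for _ in range(length):
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        indices.append(idx)
        probs = transition_probs[idx]
    return indices


def apply_sequence(
    x: Vector, indices: Sequence[int], transformations: Sequence[Transformation]
) -> Vector:
    """Apply the chosen transformations to x in order."""
    for idx in indices:
        x = transformations[idx](x)
    return x


def augment(
    x: Vector,
    start_probs: Sequence[float],
    transition_probs: Sequence[Sequence[float]],
    length: int,
    rng: random.Random,
) -> Vector:
    """Produce one augmented copy of x from a sampled transformation sequence."""
    indices = sample_sequence(start_probs, transition_probs, length, rng)
    return apply_sequence(x, indices, TRANSFORMATIONS)
```

In this sketch, the augmented points returned by `augment` would be added to the training set of whatever end discriminative model is being trained; in the approach the abstract describes, the sequence probabilities are learned with a generative adversarial objective on unlabeled data rather than set by hand.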


Reviews: Learning to Compose Domain-Specific Transformations for Data Augmentation

Neural Information Processing Systems

This paper addresses an interesting and new problem: augmenting training data in a learnable and principled manner. Modern machine learning systems are known for their 'hunger for data', and until now state-of-the-art approaches have relied mainly on heuristics to augment labeled training data. This paper tries to reduce the tedious task of finding a good combination of data augmentation strategies and their parameters by learning a sequence of augmentation strategies in a generative adversarial framework while working with unlabeled data. The motivation behind the problem is to reduce human labor without compromising the final discriminative classification performance. The problem formulation is clear from the text.


Learning to Compose Domain-Specific Transformations for Data Augmentation

Ratner, Alexander J., Ehrenberg, Henry, Hussain, Zeshan, Dunnmon, Jared, Ré, Christopher
